An entropy based segmentation algorithm for computer-generated document images

نویسندگان

  • Lijie Liu
  • Yan Dong
  • Xiaomu Song
  • Guoliang Fan
چکیده

This paper presents an efficient compression-oriented segmentation algorithm for computer-generated document images. In this algorithm, a document image is represented in a block-based multiscale pyramid. Then, image blocks will be characterized based on their entropy values of the intensity histogram, and the entropy distribution are assumed to be Gaussian priors in this work. We will discuss two methods, i.e., off-line and online training, to estimate model parameters. We use the multiscale Bayesian estimation to refine the classification results and generate the final segmentation result, where image blocks are classified into four classes, i.e., background, text, graphic and picture. It is expected that the proposed entropy-based segmentation will be suitable for compound document compression and two training approaches apply to different applications.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Document Analysis And Classification Based On Passing Window

In this paper we present Document analysis and classification system to segment and classify contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in the preprocessing by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...

متن کامل

A Novel Spot-Enhancement Anisotropic Diffusion Method for the Improvement of Segmentation in Two-dimensional Gel Electrophoresis Images, Based on the Watershed Transform Algorithm

Introduction Two-dimensional gel electrophoresis (2DGE) is a powerful technique in proteomics for protein separation. In this technique, spot segmentation is an essential stage, which can be challenging due to problems such as overlapping spots, streaks, artifacts and noise. Watershed transform is one of the common methods for image segmentation. Nevertheless, in 2DGE image segmentation, the no...

متن کامل

Segmentation Improvement of High Resolution Remote Sensing Images based on superpixels using Edge-based SLIC algorithm (E-SLIC)

The segmentation of high resolution remote sensing images is one of the most important analyses that play a significant role in the maximal and exact extraction of information.  There are different types of segmentation methods among which using  superpixels is one of the most important ones. Several methods have been proposed for extracting superpixels. Among the most successful ones, we can r...

متن کامل

Plant Classification in Images of Natural Scenes Using Segmentations Fusion

This paper presents a novel approach to automatic classifying and identifying of tree leaves using image segmentation fusion. With the development of mobile devices and remote access, automatic plant identification in images taken in natural scenes has received much attention. Image segmentation plays a key role in most plant identification methods, especially in complex background images. Wher...

متن کامل

A New Method for Sperm Detection in Infertility Cure: Hypothesis Testing Based on Fuzzy Entropy Decision

In this paper, a new method is introduced for sperm detection in microscopic images for infertility treatment. In this method, firstly a hypothesis testing function is defined to separate sperms from plasma, non-sperm semen particles and noise. Then, some primary candidates are selected for sperms by watershed-based segmentation algorithm. Finally, candidates are either confirmed or rejected us...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003